Using os/exec to Automate Local Changes

Understand how to automate local changes in Go by executing binaries using the exec package.

Automating the execution of tools that are local to the machine can provide a series of benefits to end users. The first of these is that it can reduce the toil that our team experiences. One of the primary goals for DevOps and Site Reliability Engineers (SRE) is to remove repetitive, manual processes. That time can be put to better use by reading a good book, organizing a sock drawer, or working on the next problem. The second benefit is to remove manual mistakes from a process.

It is easy to type the wrong thing or copy and paste something incorrectly. And finally, it is the core underpinning of operating at scale. Automating locally can be combined with other techniques detailed in the book to make changes at a large scale.

The automation life cycle generally comes in three stages, moving from manually doing work to automation, as follows:

  1. The first stage revolves around the manual execution of commands by an experienced engineer. While this is not automation itself, this starts a cycle that ends with some type of automation.

  2. The second stage usually revolves around writing those commands down to document the procedure, so that more than one person can share the workload. This might be a method of procedure (MOP) document, though more commonly, it is a bunch of notes that we spend an hour looking for. We highly recommend a central place to store these, such as a wiki or Markdown files in a Git repository.

  3. The third stage is usually a script to make the task repeatable.

Once a company gets larger, these stages are usually condensed into developing a service to handle the task in a fully automated way when a need for it is identified. A good example of this might be deploying pods on a Kubernetes cluster or adding a new pod configuration to our Kubernetes config. These are driven by calling command-line applications such as kubectl and git.

These types of jobs start manually; eventually, they are documented and finally automated in some way. At some point, this might move into a continuous integration/continuous deployment (CI/CD) system that handles this for us.

The key to automating tooling locally is the os/exec package. This package allows for the execution of other tools and control of their STDIN/STDOUT/STDERR streams.

Let's take a closer look.

Determining the availability of essential tools

When writing an application that calls other applications on a system, it is critical to determine if the tools needed are available on the system before we start executing commands. Nothing is worse than being partway through a procedure to find that a critical tool is missing.

The exec package provides the LookPath() function to help determine if a binary exists. If only the name of the binary is provided, the PATH environment variable is consulted, and the directories it lists are searched for the binary. If the name contains a /, only that path is consulted.

Let's say we are writing a tool that needs both kubectl and git to be installed in order to work. We can test if those tools are available in our PATH variable by executing the following code:

Testing if the tools are in our PATH

This code does the following:

  • Lines 9–12: Defines constants for our binary names.

  • Line 13: Uses LookPath() to determine if these binaries exist in our PATH variable.

In this code, we print an error if we do not find the tool. There are other options, such as attempting to install these tools with the local package manager. Depending on the makeup of our fleet, we might want to test which version is deployed and only proceed if we are at a compatible version.
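The check described above can be sketched as follows. This is a minimal illustration rather than the lesson's original listing, so the line numbers in the preceding bullets do not apply to it. The helper name missingTools is hypothetical.

```go
package main

import (
	"fmt"
	"os/exec"
)

// Names of the binaries our tool depends on.
const (
	kubectl = "kubectl"
	git     = "git"
)

// missingTools returns every name that exec.LookPath cannot resolve.
// (Hypothetical helper, introduced only for this sketch.)
func missingTools(tools ...string) []string {
	var missing []string
	for _, tool := range tools {
		if _, err := exec.LookPath(tool); err != nil {
			missing = append(missing, tool)
		}
	}
	return missing
}

func main() {
	if missing := missingTools(kubectl, git); len(missing) > 0 {
		fmt.Println("cannot find required tools in PATH:", missing)
		return
	}
	fmt.Println("all required tools are available")
}
```

Collecting every missing name before reporting lets the user fix all problems in one pass instead of discovering them one at a time.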

Let's look at using the exec.CommandContext() function to call binaries.

Executing binaries with the exec package

The exec package allows us to execute a binary using the exec.Cmd type. To create one of these, we can use the exec.CommandContext() constructor. This takes in the name of the binary to execute and the arguments to the binary, as illustrated in the following code snippet:

Executing the arguments in the binary

This creates a command that will run the kubectl tool's apply function and tell it to apply the configuration of the path stored in the config variable.

Does this command seem to have a familiar syntax? It should! kubectl is written using Cobra from our last section!

We could execute this command using several different methods on cmd, as follows:

  • .CombinedOutput(): Runs the command and returns the combined output of STDOUT/STDERR.

  • .Output(): Runs the command and returns the output of STDOUT.

  • .Run(): Runs the program and waits for it to exit. It returns an error on any issues.

  • .Start(): Runs the command but doesn't block. Used when we want to interact with the command as it runs.

.CombinedOutput() and .Output() are the most common ways to start a program. The output that a user sees in the terminal can often be both from STDOUT and STDERR. Choosing which one of these to use depends on how we want to respond to the program's output.

.Run() is used when we only need to know the exit status and do not require any of the output.

There are two main reasons to use .Start(), as outlined here:

  • There is a need to respond on STDIN to output on STDOUT.

  • The program execution takes a while, and we want to output its content to our screen, instead of waiting for the program to complete.

If we need to respond on STDIN to a program's output, using Google's goexpect package or Netflix's go-expect package is probably a better choice. These packages continue the proud tradition of porting the abilities of the Tool Command Language (TCL) Expect extension to other languages.

Let's write a simple program that tests our ability to log in to hosts on a subnet. We'll use the ping utility and the ssh client programs to test connectivity. We'll be relying on our host to recognize our SSH key (we are not using password authentication here, as that is more complicated). Finally, we'll use uname on the remote machine to determine the OS. The code is illustrated in the following snippet:

hostAlive function

Note: uname is a program found on Unix-like systems that displays information about the current OS and the hardware it runs on. Only Linux and Darwin machines are likely to have uname. Because SSH is only a connection protocol, the remote host may be a platform without uname, in which case we simply get an error. Also, a given Linux distribution might not have uname installed. There can be subtle differences between versions of common utilities on similar platforms. The Linux and OS X ping utilities share some flags but differ in others. Windows often has completely different utilities for accomplishing the same tasks. If we are trying to support all platforms with a tool that uses exec, we'll need either build constraints or the runtime package to run different utilities on different platforms.

This code does the following:

  • Line 2: Creates a *Cmd that pings a host.

    • -c 1 sends a single Internet Control Message Protocol (ICMP) packet.

    • -t 2 causes a timeout after 2 seconds on macOS ping. On Linux (iputils) ping, -t sets the TTL instead, and -W 2 sets the reply timeout.

  • Lines 3–6: Runs the command.

    • If there is an error, the ping was unsuccessful.

    • Otherwise, the host responded to the ping.

Let's now use the ssh utility to send a command to be run on the remote machine, as follows:

Sending a command to be run on the remote machine

This code does the following:

  • Line 4: Sets a timeout of 5 seconds, if ctx has none

  • Line 7: Creates a user@host login line

  • Lines 8–15: Creates a *Cmd that issues the command: ssh user@host "uname -a"

    • The StrictHostKeyChecking option is set so that ssh adds host keys automatically instead of prompting.

    • The BatchMode option prevents asking for passwords.

  • Lines 16–20: Runs the command and captures the output from STDOUT

    • If successful, it runs uname -a and returns the output.

    • The host must have the user's SSH key for this to work.

    • Password authentication requires either the sshpass utility or an Expect package.

We need a type to store the data we gather. Let's create that, as follows:

The record type

Now, we need some code to take a channel containing Internet Protocol (IP) addresses that need to be scanned. We want to do this in parallel, so we'll be using goroutines, as illustrated in the following code snippet:

The scanPrefixes function

This code does the following:

  • Line 1: Takes in a channel of net.IP

  • Line 2: Creates a channel to put records on

  • Lines 3–23: Spins off a goroutine to do all the scanning

    • Line 4: Defers closure of our output channel.

    • Lines 7–21: Loops through all IPs on the incoming channel.

    • Line 8: Uses the limit channel to cap concurrency at 100 pings.

    • Lines 10–20: Spins a goroutine for each ping.

      • Line 11: Decrements the limiter when we finish.

      • Line 13: Makes a timeout of 2 seconds for our ping.

      • Line 16: Calls our hostAlive() function.

      • Line 19: Outputs the result on our ch output channel.

    • After the loop: Waits for all pings to finish with a WaitGroup.

  • Line 24: Returns the channel

We now have a function that will asynchronously ping hosts in parallel and put the result on a channel.

Our ssh function has a similar function signature to scanPrefixes, as we can see here:

The ssh function signature

For brevity, we are not going to include the code here, but you can see it in the terminal at the end of this section.

These are the big differences between scanPrefixes() and unamePrefixes():

  • We receive a channel of record, the output of scanPrefixes()

  • If rec.Reachable is false, we put rec on the output channel without adding OS information to the fields

  • Otherwise, we call runUname() instead of hostAlive()

Now, let's set up our main() function, as follows:

The main function

This code does the following:

  • Lines 2–9: Checks that our binaries exist in the path

  • Lines 10–12: Checks we have the correct number of arguments, which is 1

    • We check that len(os.Args) == 2 because the first argument is the binary name.

  • Line 13: Retrieves a channel of IPs in the network passed in the argument

    • The implementation of the hosts() function is not detailed here, but we'll find it in the repository.

  • Line 17: Gets the current user's login name
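Since hosts() is not shown, here is one way it might be written, together with the user lookup. This is a guess at the behavior, not the repository's implementation. It uses net/netip (Go 1.18 or later) and emits every address in the prefix, including the network and broadcast addresses, which a real scanner might skip. The example prefix is a documentation-only range.

```go
package main

import (
	"fmt"
	"net"
	"net/netip"
	"os/user"
)

// hosts sends every address in the CIDR prefix on the returned channel
// and then closes it. (Assumed behavior, see the lead-in.)
func hosts(cidr string) (chan net.IP, error) {
	prefix, err := netip.ParsePrefix(cidr)
	if err != nil {
		return nil, err
	}
	prefix = prefix.Masked() // start at the first address of the network

	ch := make(chan net.IP, 1)
	go func() {
		defer close(ch)
		for addr := prefix.Addr(); addr.IsValid() && prefix.Contains(addr); addr = addr.Next() {
			ch <- net.IP(addr.AsSlice())
		}
	}()
	return ch, nil
}

func main() {
	// A real tool would check len(os.Args) == 2 and use os.Args[1] here.
	ipCh, err := hosts("192.0.2.0/30")
	if err != nil {
		fmt.Println("bad CIDR:", err)
		return
	}
	for ip := range ipCh {
		fmt.Println(ip)
	}

	u, err := user.Current()
	if err != nil {
		fmt.Println("cannot determine current user:", err)
		return
	}
	fmt.Println("logging in as:", u.Username)
}
```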

Now, we need to scan our prefixes and concurrently process the results by doing a login and retrieving the uname output, as follows:

Continued main function

This code does the following:

  • Sends scanPrefixes() a channel of IPs

  • Line 2: Receives the results on scanResults

  • Line 3: Sends the channel of results to unamePrefixes()

  • Line 6: Prints the JSON results to STDOUT

The key to this code is the channel read in the for range loops in scanPrefixes() and unamePrefixes(). When all IPs have been sent, ipCh will be closed. That will stop our for range loop in scanPrefixes(), which will cause its output channel to close. That causes unamePrefixes() to see the closure and close its own output channel. This will in turn end our for rec := range unameResults loop and stop printing. Using this chaining concurrency model, we'll be scanning up to 100 IPs while SSHing into a maximum of 100 hosts and printing the results to the screen, all at the same time.
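The way closure propagates down the chain can be seen in isolation in this stripped-down sketch. The generic stage helper and the string payloads are stand-ins invented for the example. Each stage here processes items one at a time, unlike the concurrent functions in the lesson.

```go
package main

import "fmt"

// stage reads from in until it is closed, applies work to each item,
// and closes its own output when the input is exhausted.
// (Hypothetical helper, introduced only for this sketch.)
func stage(in <-chan string, work func(string) string) <-chan string {
	out := make(chan string, 1)
	go func() {
		defer close(out) // closing here ends the next stage's range loop
		for v := range in {
			out <- work(v)
		}
	}()
	return out
}

func main() {
	ipCh := make(chan string)
	go func() {
		defer close(ipCh) // no more IPs: starts the chain of closures
		for _, ip := range []string{"192.0.2.1", "192.0.2.2", "192.0.2.3"} {
			ipCh <- ip
		}
	}()

	scanResults := stage(ipCh, func(ip string) string { return ip + " reachable" })
	unameResults := stage(scanResults, func(r string) string { return r + ", uname collected" })

	// This loop ends only after both upstream stages have closed their outputs.
	for rec := range unameResults {
		fmt.Println(rec)
	}
}
```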

We have stored the output of uname -a in our record variable but in an unparsed format. We could use a lexer/parser or regular expression (regex) to parse that into a struct. If we need to use the output of an executed binary, we recommend finding tools that can output in a structured format such as JSON instead of parsing it ourselves.
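As a brief illustration of why structured output is easier to consume, the sketch below decodes JSON into a struct. The JSON document and its field names are invented for the example. They stand in for the bytes that cmd.Output() would return from a tool with a JSON output mode.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// osInfo mirrors the fields of the hypothetical JSON output.
type osInfo struct {
	Kernel  string `json:"kernel"`
	Release string `json:"release"`
	Arch    string `json:"arch"`
}

// parseOSInfo decodes the tool's output. There is no field splitting or
// regular expression to maintain, and malformed output yields an error.
func parseOSInfo(out []byte) (osInfo, error) {
	var info osInfo
	err := json.Unmarshal(out, &info)
	return info, err
}

func main() {
	// Invented sample, standing in for the result of cmd.Output().
	sample := []byte(`{"kernel":"Linux","release":"6.1.0","arch":"x86_64"}`)

	info, err := parseOSInfo(sample)
	if err != nil {
		fmt.Println("cannot parse output:", err)
		return
	}
	fmt.Println(info.Kernel, info.Release, info.Arch)
}
```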

The code can be seen below:

The executable scanner code (scanner/scanner.go)

As an argument, the code segment shown above requires the network CIDR to scan. In our case, the command that runs in the terminal above is as follows:

Notes on using the exec package

There are some things we should look out for when using exec. One of the big ones is if the binary being invoked takes control of the terminal. ssh does this, for example, to get a password from the user. We suppressed this in our example, but when this happens, it bypasses the normal STDOUT we are reading.

This happens when someone uses terminal mode. In those cases, we'll want to use goexpect or go-expect if we must deal with it. Generally, this is something where we want to find alternatives. However, some software and various routing equipment will implement menu-driven systems and use terminal modes that cannot be avoided.

In this lesson, we have talked about automating the command line with the exec package. We now have the skills to check for binaries on the system and execute those binaries. We can check the error condition and retrieve the output.

Note: In general, always use a package instead of a binary when available. This keeps system dependencies low and makes code more portable.
